San Francisco Chronicle




sfgate.com 


TUESDAY,  NOVEMBER  22,  2005 


415-777-1111 46¢ plus tax


A  MAN'S  VISION:  WORLD  LIBRARY  ONLINE 

Brewster  Kahle  hopes  to  realize  his  25-year  dream  of  an  international  book  archive 


By  Heidi  Benson 
Chronicle  Staff  Writer 


John  Storey  /The  Chronicle 

Brewster  Kahle  introduces  a  scanner  to  digitize  books  and  plans 
for  a  digital  library  at  an  event  at  the  Presidio's  Golden  Gate  Club. 


A fat October moon shone through the Presidio treetops the night
Brewster Kahle launched the latest shot in the space-race for a
digital library.

"Let's get the people's books back to the people!" said Kahle,
standing at the podium inside the Golden Gate Club.

Founder of the Internet Archive, Kahle is an ebullient technology
visionary of the type Northern California cultivates. He has been
widely recognized as a digital guru and a catalyst for change.

Now, his vision is helping shape the debate over how a book library
should reside on the Internet. His idealistic yet pragmatic approach
— providing free digital access to works in the public domain —
could be a bridge to detente in the war between publishers and
Google Inc.

While Google has alienated authors and publishers with its plan to
digitize books still in copyright, Kahle has moved gingerly, forging
collaborations with Google's fiercest archrivals — Microsoft and
Yahoo — to create a kinder, gentler digital library effort called
the Open Content Alliance.

The alliance, focused on books no longer under copyright — that is,
books published before 1923 — echoes the computer industry's open
source movement, which has sought to spur innovation by enabling
software engineers to freely share their code.

Google's library initiative, the Google Print Library Project, which
plans to digitize books from the collections of its partner
libraries (the New York Public Library plus the libraries of Oxford
University, Harvard, Stanford and the University of Michigan) —
including many books still in copyright — has earned the ire of
authors and publishers.

"Google is building a database of value that was created by authors
and publishers and using it to advance the interests of its
revenue-generating, for-profit search-engine operation," Allan Adler,
vice president for legal and government affairs of the Association
of American Publishers, which has filed a copyright infringement
suit, told The Chronicle by phone from New York.

■ ■ ■

The launch of the Open Content Alliance was like a step back in time
for one attendee.

"I had a feeling of being back in the early days of open source
software — where everybody was there because they hated Microsoft,"
said Paul Duguid, a visiting scholar at UC Berkeley's School of
Information Management and Systems. "This was the un-Google
meeting."

That night, Kahle unveiled his new book scanner, Scribe. A kind of
portable darkroom, it looks like a black-draped office cubicle.
Inside, two digital cameras peer down on a book held in a V-shaped
glass cradle.

A  human  technician  turns  the 
pages  and  works  the  cameras  via 
foot  pedals.  (Automated  systems 
can  damage  precious  paper.)  The 
results  are  super-high  resolution 
photographs  at  a  cost  of  10  cents 
per  page. 

A handful of Scribe machines already have been sent to the library
of the University of Toronto, an alliance partner.

But more revolutionary than the book scanners — variations of which
Stanford, Google and others have developed — is the technology Kahle
has tested with his Internet Bookmobile, which enables scanned books
to be printed out and bound in volumes that faithfully resemble the
original.

It is here that Kahle and the Open Content Alliance have topped
Google — by already getting real books into readers' hands, even in
book-starved eastern Africa.

"It  doesn't  look  like  a  printout 
with  a  staple,"  Kahle  said.  "It 
doesn't  look  like  a  report.  It  looks 
like  a  book. 

"Maybe  I'm  old-fashioned,  but 
I  still  love  books." 

■ ■ ■

"My interest is to build the great library. ... It is now technically
possible to live up to the dream of the Library of Alexandria."

Brewster Kahle, digital archivist


By light of day, the parking lot of the Internet Archive's
headquarters at the Presidio hosts a motley array of vehicles,
including Kahle's favorite invention, the Internet Bookmobile, a
green Ford van with a satellite dish on the roof and a printer and
bookbinding contraption on the tailgate. The slightly creaky
clapboard building, built in 1857 as a military residence and store,
has little in common with the nearby sleek campus of George Lucas'
Industrial Light & Magic.

"Where are the machines?" a visitor might ask. The Internet
Archive's souped-up servers — storing petabytes of information
(1 petabyte equals 100 million pages) — are all South of Market,
filling three warehouses to the rafters.

Kahle's journey started at the Massachusetts Institute of
Technology, where he studied artificial intelligence. After
graduating in 1982, he helped start a company called Thinking
Machines. By 1989, he had invented the first electronic publishing
system, WAIS (Wide Area Information Server), with a client list that
included the White House, the Government Printing Office, both
houses of Congress, the Wall Street Journal and the New York Times.

The company was acquired by AOL in 1995, and Kahle decamped for San
Francisco, where he started the nonprofit Internet Archive in 1996
to serve as a permanent archive of digital work — Web pages, music,
books, software programs — available free to scholars and
researchers. That year he also started a for-profit arm, Alexa
Internet, a tool for crawling the Web, which he sold to Amazon.com
in 1999.

"My interest is to build the great library," said Kahle, perching
briefly in the conference room of the Internet Archive shortly
before the alliance event. "That was the goal I set for myself 25
years ago. It is now technically possible to live up to the dream of
the Library of Alexandria."

That storied institution on the Nile delta housed all the world's
knowledge until its mysterious destruction 1,600 years ago.

"Folks are using the Internet as a library, and they're using it
many times every day," Kahle continued. "We're seeing much more
traffic on the Internet than we ever did in our public library
system, but what's available on the Internet isn't the best we have
to offer. Almost everything on the Internet has been written since
1996 — and most of it has been written for the Internet." Kahle's
dream is to collect online the great books on which modern
civilization is based.

"Do you know what's carved above the Carnegie Library in
Pittsburgh? - 'FREE TO THE PEOPLE' - what a goal!" Kahle said. "I
can believe in this! At the Internet Archive, we think of our
mission as 'universal access to all knowledge.'

"That  should  be  carved  over 
our  door." 


Early this year, Kahle was in talks with Yahoo's vice president for
search technology, David Mandelbrot, and Sumir Meghani, business
development manager of the Sunnyvale Internet company.

"We wanted to figure out how the nonprofit sector could work with
the commercial sector," Kahle said. The subject of "a digital
library of Alexandria" just naturally came up.

Yahoo proposed creating a freely accessible digital library that
would include only books in the public domain.

"After that, it was easy to know how to proceed," Kahle said.

It  was  agreed  that  Yahoo  would 
supply  the  search  engine  for  the 
Web  site  and  index  the  books 
scanned  by  the  Internet  Archive's 
Scribe  machines. 

The Open Content Alliance was born. By October, an impressive group
of libraries and publishers had promised to participate, including
the Smithsonian, Johns Hopkins University, University of Toronto,
British National Archives, European Archives, O'Reilly Media and
Prelinger Archives plus multimedia companies LibriVox, Octavo and
others.

The University of California already has started its contribution:
a collection of 18,000 works of American fiction, which librarians
are selecting from the 10-library statewide system. Microsoft's MSN
Search has promised $5 million toward the scanning of 150,000 books,
and both Adobe and Hewlett-Packard will contribute advanced digital
imaging.

Kahle  hopes  to  have  "a  couple 
of  great  collections  up  on  the  Web 
by  the  end  of  2006." 

■  ■  ■ 

The Google Print Library Project differs from Kahle's in an
important way: Google is creating not a library but a vast
electronic card catalog.

"We have been very clear that we want to build a book-finding tool,
not a book-reading tool," said Jim Gerber of Google Print.

"Even before we started Google, we dreamed of making the incredible
breadth of information that librarians so lovingly organize
searchable online," co-founder Larry Page told author David A. Vise
in his new book, "The Google Story," out this month from Delacorte.

Back in October 2004, they took the first step, announcing the
Google Print program at the annual Frankfurt Book Fair. It would
allow viewers to search books online — but not scan or print them
out — based on agreements with publishers. A similar project,
Amazon's free "Search Inside the Book," had already proved to boost
book sales.

Since Google's main source of revenue is its signature all-text ads,
which are linked by topic but are separate from searched content,
that model would be repeated. Google and the publishers would split
the proceeds.

But this summer, at the 2005 Frankfurt Book Fair, publishers and
authors were bristling over Google's most recent announcement.

A new project — Google Print Library — was set to begin digitizing
library books, including many still under copyright. Only snippets
of text would be viewable, so Google claimed this was fair use.
Google also considered itself under no obligation to ask copyright
holders' permission before scanning books.

Publishers saw it differently: Because entire books would be
digitized to provide such snippets, they feared piracy — and the
damage that free file-sharing has done to the music industry. Also,
publishers were wary of Google having the biggest online library in
the world at its disposal when, in the future, copyright law changes
to adapt to the Internet age.

In August, the 3,000-member Authors Guild sued Google to cease and
desist; the Association of American Publishers, with 300 members,
sued for copyright infringement; and PEN USA and the International
Publishers Association issued a joint declaration calling the Google
Library Project "in breach of existing copyright law."

John Storey / The Chronicle

The Internet Archive's digital library project has its headquarters
at the Presidio in a building built as housing and a store in 1857.

In response, Google suspended book scanning for three months to give
authors and publishers time to be excluded if they feared piracy.

"Early on in the discussions about Google Print, that was one of the
fears noted most regularly," Gerber said. "Frankly, that's part of
the reason we changed our policy. That's the purpose of our
exclusion option."

On Nov. 1, Google resumed scanning books. Just two days later,
Google Print announced the availability of its first large
collection of books — all in the public domain. And this week, a
strategic name change was announced. Google Print is now Google Book
Search. A posting on Thursday by Jen Grant, product marketing
manager, said: "Why the change? Well, one factor was all the
comments we got about how excited people were that Google Print
would help them print out their documents, or Web pages they visit —
which of course it won't."

Meanwhile, in anticipation of the new digital marketplace, two other
companies scrambled to accommodate fee-based online viewing. Amazon
announced Amazon Pages (unlike "Search Inside the Book," a fee will
be charged for page viewing of certain books). And, separately, one
of the nation's largest publishers, Random House, set a price for
future transactions — 4 cents per page for viewing more than 5
percent of a book.

"Brewster Kahle is an activist, not an empire-builder," said Paul
Saffo of the Palo Alto-based Institute for the Future.

"What I've always admired about Brewster Kahle is his attitude —
'let's get the job done and find out what the wrinkles are,' " said
UC's Duguid.

"If they would team up — with Google's strength and Kahle's
philosophy," Duguid mused, "that would be great."

When asked whether an association with Open Content Alliance was in
the works, Google spokesman Nate Tyler said, "We are talking to
them, but there's nothing to announce yet." More than once, Kahle
has expressed his desire to see Google join forces with the project
in some capacity.

Before  taking  the  podium  at 
the  Presidio,  Kahle  told  a  visitor, 
"I  applaud  Google's  efforts. 
They've  got  a  bold  vision.  But 
their  approach  seems  to  have 
caused  lawsuits. 

"C'mon,  guys!  Let's  get  the 
businesspeople  back  at  the  table, 
and  send  the  lawyers  back  to  their 
cubicles!" 

Circulating in the crowd that night was Kahle's wife, Mary Austin,
founder of the San Francisco Center for the Book, and their two
sons. It was the end of a long day. The next morning, the family was
set to fly to China, where Kahle would address an international
conference on digital libraries.

"If we do this right, it will be remembered as one of the great
things humans have done, up there with the Library of Alexandria,
Gutenberg's press and putting a man on the moon," Kahle said in
closing. "We're going step by step — first, let's see if we can get
the technology right so that you'll actually want to see a book on a
screen."

E-mail  Heidi  Benson  at 
hbenson@sfchronicle.com. 


